1 Introduction

Let $H$ be an infinite dimensional separable Hilbert space and let $\mathcal{L} : H \to H$ be an injective compact linear operator with non-closed range. We consider the ill-posed inverse problem of finding $u$ from data $d$, where

$$d = \mathcal{L}u + \eta, \tag{1.1}$$

and where $\eta$ represents noise. The problem (1.1) is called mildly or modestly ill-posed if the singular values of the forward mapping $\mathcal{L}$ decay algebraically, while it is called severely ill-posed if the singular values of $\mathcal{L}$ decay exponentially [6]. Our interest is focussed on the severely ill-posed case, and on the small observational noise limit.

The use of classical (deterministic) regularization methods for (1.1), and the small-noise limit in particular, is well-studied in both the mildly ill-posed [6] and severely ill-posed [9] cases; nonlinear inverse problems have also been studied from this perspective [6]. However, if we wish to incorporate information concerning the statistical structure of the noise and solution, then it is natural to adopt a Bayesian perspective. The Bayesian approach to linear ill-posed inverse problems was adopted in [7], in which the severely ill-posed problem of inverting the heat kernel was considered, and then developed systematically in [17, 16]. More recently, nonlinear inverse problems have been given a Bayesian formulation [12, 20, 13, 14]. However, the study of the small noise limit, known as posterior consistency in the Bayesian context, is an under-developed aspect of the Bayesian methodology for inverse problems. Our work adds to the growing literature in this area.

For mildly ill-posed linear problems, subject to Gaussian observational noise, Bayesian posterior consistency is considered in the recent papers [1, 11]. In [11] sharp contraction rates are obtained for white observational noise when the forward operator $\mathcal{L}$ and the prior covariance operator are simultaneously diagonalizable; this allows the analysis to proceed through the study of an infinite set of uncoupled scalar linear inverse problems. In [1] the setting of [11] is generalized to allow for non-white noise and operators which are not simultaneously diagonalizable, using tools from PDE theory. The paper [10] is, to the best of the authors' knowledge, the first to study Bayesian posterior consistency for severely ill-posed problems. It concerns the one-dimensional backward heat equation with white noise, where the $j$th eigenvalue of the (self-adjoint) forward mapping decays exponentially in $j^2$, and works in the simultaneously diagonalizable paradigm of [11]. In this paper, we generalize the work in [10] by studying Bayesian posterior consistency for a class of severely ill-posed inverse problems in which the $j$th singular value of $\mathcal{L}$ decays as $e^{-sj^b}$ for arbitrary positive $s$ and $b$, again working in the simultaneously diagonalizable paradigm of [11]. In addition to the backward heat equation considered in [10] ($b = 2$), there are a variety of ill-posed inverse problems covered by our theory, for instance the Cauchy problem for the Laplace equation and the Cauchy problem for the Helmholtz equation or the modified Helmholtz equation (see ^1] and the references therein): the eigenvalue decay of the forward mapping for these three examples corresponds to $b = 1$. Our analysis is inspired by both the problem and techniques used in [10]; however, our generalized setting leads to some technical improvements in the proofs, we discuss new results relating to the equivalence of the prior and posterior, and we include a numerical illustration for the Helmholtz equation.

The rest of this paper is organized as follows. In Section 2 we introduce notation and give informal calculations for the posterior mean and covariance operator. In Section 3 we characterize the posterior distribution rigorously and show that it is equivalent, in the sense of measures, to the prior; see Theorems 3.1 and 3.2. In Section 4 we present the main results concerning posterior consistency, characterizing the error in the mean in Theorem 4.1, the contraction of the posterior covariance in Theorem 4.2, and putting these together to estimate posterior contraction rates in Theorem 4.3. Some technical lemmas which are essential to the proofs of Theorems 4.1, 4.2 and 4.3 are collected at the end of that section. Section 5 concludes the paper with a simple example for which the theoretical analysis can be applied and includes a numerical experiment which is consistent with the theory.

2 Notation and Problem Setting 
2.1 Notation 

Throughout the paper, $\langle\cdot,\cdot\rangle$ and $\|\cdot\|$ denote the inner product and norm of the Hilbert space $H$. For a self-adjoint positive operator $\Gamma$, we define the weighted inner product and the corresponding norm as follows,

$$\langle\cdot,\cdot\rangle_\Gamma = \langle\Gamma^{-\frac12}\cdot,\Gamma^{-\frac12}\cdot\rangle, \qquad \|\cdot\|_\Gamma = \|\Gamma^{-\frac12}\cdot\|.$$

For two sequences $k_j$ and $h_j$ of real numbers, $k_j \asymp h_j$ means that $k_j/h_j$ is bounded from above and below (by positive constants) as $j \to \infty$, $k_j \lesssim h_j$ means that $k_j/h_j$ is bounded from above as $j \to \infty$, and $k_j \sim h_j$ means that $k_j/h_j \to 1$ as $j \to \infty$. We will use $M$ to denote a constant which is different from occurrence to occurrence. Let $\{\varphi_j\}_{j=1}^\infty$ denote an orthonormal basis in $H$. Then we can express $u \in H$ as $u = \sum_{j=1}^\infty u_j\varphi_j$ with $u_j = \langle u, \varphi_j\rangle$, and for $\gamma \ge 0$ we can define the norm $\|\cdot\|_\gamma$ by

$$\|u\|_\gamma^2 = \sum_{j=1}^\infty j^{2\gamma}u_j^2.$$

We use $H^\gamma$, $\gamma \ge 0$, to denote the Sobolev-like space

$$H^\gamma = \{u \in H : \|u\|_\gamma < \infty\}.$$

For $\gamma < 0$, we define the spaces by duality: $H^\gamma = (H^{-\gamma})^*$.

In the following we consider random variables drawn from Gaussian distributions in $H$, denoted by $N(\theta, \Sigma)$, where the mean $\theta$ is an element of $H$ and the covariance operator $\Sigma$ is a positive definite, self-adjoint, trace class, linear operator in $H$. The operator $\Sigma$ possesses an infinite set of eigenfunctions $\{\psi_j\}_{j\in\mathbb{N}}$ which correspond to positive eigenvalues $\{\sigma_j\}_{j\in\mathbb{N}}$ and which form an orthonormal basis of $H$. One can express a draw $y$ from $N(\theta, \Sigma)$ using the Karhunen-Loeve expansion as

$$y = \theta + \sum_j \sqrt{\sigma_j}\,\xi_j\psi_j, \tag{2.1}$$

where $\xi_j$ are independent and identically distributed $N(0, 1)$ real random variables [3, 20]. In particular, the expansion coefficients $y_j = \theta_j + \sqrt{\sigma_j}\,\xi_j$ are $N(\theta_j, \sigma_j)$ real variables, and it is easy to see that $\mathbb{E}\|y\|^2 = \|\theta\|^2 + \mathrm{Tr}(\Sigma)$ and that for any bounded linear operator $T$ in $H$, $Ty$ is distributed as $N(T\theta, T\Sigma T^*)$.
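The Karhunen-Loeve expansion (2.1) and the identity $\mathbb{E}\|y\|^2 = \|\theta\|^2 + \mathrm{Tr}(\Sigma)$ can be illustrated numerically. The following is only a minimal sketch in Python with NumPy, not part of the original analysis: the truncation level `J`, the choices `theta` and `sigma`, the sample size and the function names are all hypothetical and chosen purely for illustration.

```python
import numpy as np

def kl_sample_coeffs(theta, sigma, n_draws, rng):
    """Coefficients y_j = theta_j + sqrt(sigma_j) * xi_j of truncated draws
    from N(theta, Sigma), one draw per row (illustrative helper)."""
    xi = rng.standard_normal((n_draws, theta.size))  # i.i.d. N(0, 1) variables
    return theta + np.sqrt(sigma) * xi

def expected_sq_norm(theta, sigma):
    """Exact value of E||y||^2 = ||theta||^2 + Tr(Sigma) for the truncation."""
    return float(np.sum(theta**2) + np.sum(sigma))

# Assumed, illustrative choices (not taken from the paper).
J = 200
j = np.arange(1, J + 1, dtype=float)
theta = j ** -2.0          # mean coefficients
sigma = j ** -2.0          # eigenvalues of a trace class covariance
rng = np.random.default_rng(0)

draws = kl_sample_coeffs(theta, sigma, 20000, rng)
mc_estimate = float(np.mean(np.sum(draws**2, axis=1)))  # Monte Carlo E||y||^2
exact = expected_sq_norm(theta, sigma)
```

Comparing `mc_estimate` with `exact` gives a quick sanity check that the sampled coefficients have the stated mean square norm, up to Monte Carlo error.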

2.2 Bayesian setting and informal characterization of the posterior

In this subsection we describe the assumptions underlying the Bayesian formulation of the linear inverse problem. Furthermore we provide informal calculations which motivate the expressions for the posterior mean and covariance. These will be made precise in Section 3.

We place a scaled Gaussian prior on the unknown $u$ of the form $\mu_0 := N(0, \tau^2\mathcal{C}_0)$, where $\tau > 0$ and $\mathcal{C}_0$ is a self-adjoint, positive-definite, trace class, linear operator on $H$. We assume Gaussian observational noise in (1.1) which is independent of $u$. In particular, we model the data as

$$d = \mathcal{L}u + \frac{1}{\sqrt{n}}\xi, \tag{2.2}$$

where $\frac{1}{\sqrt{n}}$ is a scale parameter modelling the noise level and $\xi$ is a random variable independent of $u$ and distributed as $N(0, \mathcal{C}_1)$. The linear operator $\mathcal{C}_1$ is assumed to be self-adjoint, positive-definite, bounded, but not necessarily trace class on $H$. This allows for the possibility of having irregular noise which is not in $H$. For example, the case where $\xi$ is white noise corresponds to $\mathcal{C}_1 = I$, and can be viewed as a Gaussian random variable in $H^{-r}$ for $r > \frac12$. Under these assumptions, the conditional distribution of $d|u$, called the data likelihood, is the translate of $N(0, \frac1n\mathcal{C}_1)$ by $\mathcal{L}u$, which is also Gaussian:

$$N\Big(\mathcal{L}u, \frac1n\mathcal{C}_1\Big). \tag{2.3}$$

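In finite dimensions the density of the posterior distribution, that is the conditional distribution of $u|d$, is found from Bayes' rule to be proportional to $\exp(-\Phi(u))$, where

$$\Phi(u) = \frac{n}{2}\|d - \mathcal{L}u\|^2_{\mathcal{C}_1} + \frac{1}{2\tau^2}\|u\|^2_{\mathcal{C}_0}. \tag{2.4}$$

This suggests that in our infinite dimensional setting, the posterior distribution is Gaussian, $\mu^d := N(m, \mathcal{C})$, where the mean $m$ and covariance $\mathcal{C}$ can be informally derived from (2.4) using completion of the square:

$$\mathcal{C}^{-1} = n\mathcal{L}^*\mathcal{C}_1^{-1}\mathcal{L} + \frac{1}{\tau^2}\mathcal{C}_0^{-1} \tag{2.5}$$

and

$$\frac1n\mathcal{C}^{-1}m = \mathcal{L}^*\mathcal{C}_1^{-1}d. \tag{2.6}$$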

Observe that the posterior mean $m$ is the minimizer of the functional $\Phi(u)$. If we define $\Phi_0(u) = \frac1n\Phi(u)$ and denote

$$\lambda := \frac{1}{n\tau^2}, \tag{2.7}$$

then $m$ also minimizes the functional $\Phi_0(u)$, that is,

$$m = \arg\min \Phi_0(u), \tag{2.8}$$

where

$$\Phi_0(u) = \frac12\|d - \mathcal{L}u\|^2_{\mathcal{C}_1} + \frac{\lambda}{2}\|u\|^2_{\mathcal{C}_0}.$$

Thus the posterior mean is a Tikhonov-Phillips regularized solution in the classical sense. This reveals the close connection between Bayesian and classical regularization for inverse problems. In the deterministic framework, $\lambda$ is called the regularization parameter, which is carefully chosen in order to balance consistency and stability. Similarly, for a given inverse noise level $n$, the scale parameter $\tau$ introduced in the prior can be judiciously chosen to guarantee a small error between the posterior mean and the true solution, as we will see in Section 4.

Posterior consistency refers, in statistical inverse problems, to studying the relationship between the result of the statistical analysis and the truth which underlies the data in either the small noise or large data limits; we concentrate on the small noise limit. We consider the standard Bayesian variant on frequentist posterior consistency [5, 8] for our severely ill-posed inverse problem. To this end we consider observations which are perturbations of the image of a fixed element $u^\dagger \in H^\gamma$ by a scaled Gaussian additive noise, that is, we have data $d = d^\dagger$ of the form

$$d^\dagger = \mathcal{L}u^\dagger + \frac{1}{\sqrt{n}}\xi, \tag{2.9}$$

where $\xi$ is a single realization of $N(0, \mathcal{C}_1)$. This choice of data model gives the posterior distribution as $\mu^{d^\dagger}_{\lambda,n} := N(m^\dagger, \mathcal{C})$, where $\mathcal{C}$ is given by (2.5) and $m^\dagger$ is given by (2.6) with $d = d^\dagger$. Similar to the practice in the deterministic framework, we assume a-priori known regularity of the true solution $u^\dagger$ and identify contraction rates of the posterior to a Dirac measure centered on the true solution, as the noise disappears ($n \to \infty$).

2.3 Model assumptions 

In this subsection we present our assumptions on the operators appearing in our framework, that is, on the forward operator $\mathcal{L}$, the prior covariance operator $\mathcal{C}_0$ and the noise covariance operator $\mathcal{C}_1$.

Assumption 2.1. The operators $\mathcal{L}$, $\mathcal{C}_0$ and $\mathcal{C}_1$ commute with one another, so that $\mathcal{L}^*\mathcal{L}$, $\mathcal{C}_0$ and $\mathcal{C}_1$ have the same eigenfunctions $\{\varphi_j\}_{j=1}^\infty$. The corresponding eigenvalues $\{l_j^2\}_{j=1}^\infty$, $\{c_{0j}\}_{j=1}^\infty$ and $\{c_{1j}\}_{j=1}^\infty$ of $\mathcal{L}^*\mathcal{L}$, $\mathcal{C}_0$ and $\mathcal{C}_1$ are assumed to satisfy

$$l_j \asymp \exp(-sj^b), \qquad c_{0j} = j^{-2\alpha}, \qquad c_{1j} = j^{-2\beta}, \tag{2.10}$$

for $s > 0$, $b > 0$, $\alpha > \frac12$, $\beta \ge 0$. Furthermore, the fixed true solution $u^\dagger$ belongs to $H^\gamma$ for some $\gamma > 0$.

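Remark 2.2. As in the well-known finite dimensional case, in the current infinite dimensional separable Hilbert-space setting, if $\mathcal{L}$, $\mathcal{C}_0$ and $\mathcal{C}_1$ commute with one another, then $\mathcal{L}^*\mathcal{L}$, $\mathcal{C}_0$ and $\mathcal{C}_1$ have the same eigenfunctions $\{\varphi_j\}_{j=1}^\infty$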

IZ3E3/. 

Remark 2.3. One can relax the assumptions on the eigenvalues of $\mathcal{C}_0$ and $\mathcal{C}_1$ to $c_{0j} \asymp j^{-2\alpha}$ and $c_{1j} \asymp j^{-2\beta}$ without affecting any of the subsequent results.



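3 Characterization of the Posterior

In [17, 16] it is proved in the infinite dimensional setting that the posterior is Gaussian with covariance and mean given by

$$\mathcal{C} = \tau^2\mathcal{C}_0 - \tau^2\mathcal{C}_0\mathcal{L}^*(\mathcal{L}\mathcal{C}_0\mathcal{L}^* + \lambda\mathcal{C}_1)^{-1}\mathcal{L}\mathcal{C}_0 \tag{3.1}$$

and

$$m = \mathcal{C}_0\mathcal{L}^*(\mathcal{L}\mathcal{C}_0\mathcal{L}^* + \lambda\mathcal{C}_1)^{-1}d, \tag{3.2}$$

respectively. In the simultaneously diagonalizable case considered here, these formulae are equivalent to the formulae (2.5) and (2.6) [Example 6.23]. Furthermore, since $\mathcal{L}$, $\mathcal{C}_0$ and $\mathcal{C}_1$ commute with one another, the equations (3.1) and (3.2) can be rewritten as

$$\mathcal{C} = \tau^2\mathcal{C}_0 - \tau^2A\mathcal{L}\mathcal{C}_0 \tag{3.3}$$

and

$$m = Ad, \tag{3.4}$$

where $A : H \to H$ is the continuous linear operator

$$A = \mathcal{C}_0^{\frac12}(\mathcal{C}_0^{\frac12}\mathcal{L}^*\mathcal{L}\mathcal{C}_0^{\frac12} + \lambda\mathcal{C}_1)^{-1}\mathcal{C}_0^{\frac12}\mathcal{L}^* = \mathcal{C}_0\mathcal{L}^*(\mathcal{L}\mathcal{C}_0\mathcal{L}^* + \lambda\mathcal{C}_1)^{-1}.$$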
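In the simultaneously diagonalizable setting, the equivalence between the precision form (2.5)-(2.6) and the form (3.3)-(3.4) can be checked mode by mode. The sketch below is an illustration under stated assumptions rather than the authors' code: the function names, the ten modes, the synthetic data coefficients `d` and the values of `n` and `tau` are hypothetical; the eigenvalue choices mimic Assumption 2.1 with $s = b = 1$, $\alpha = 2$, $\beta = 0$.

```python
import numpy as np

def diagonal_posterior(l, c0, c1, d, n, tau):
    """Mode-wise posterior mean and variance via (3.3)-(3.4)."""
    lam = 1.0 / (n * tau**2)                  # lambda as in (2.7)
    a = c0 * l / (l**2 * c0 + lam * c1)       # eigenvalues of the operator A
    mean = a * d                              # (3.4): m = A d
    var = tau**2 * c0 - tau**2 * a * l * c0   # (3.3)
    return mean, var

def precision_form(l, c0, c1, d, n, tau):
    """Mode-wise posterior mean and variance via (2.5)-(2.6)."""
    prec = n * l**2 / c1 + 1.0 / (tau**2 * c0)  # (2.5)
    var = 1.0 / prec
    mean = n * var * l * d / c1                 # (2.6) solved for m
    return mean, var

# Hypothetical illustrative inputs.
j = np.arange(1, 11, dtype=float)
l = np.exp(-j)            # l_j = exp(-s j^b) with s = b = 1
c0 = j ** -4.0            # alpha = 2
c1 = np.ones_like(j)      # beta = 0 (white noise)
d = np.cos(j)             # synthetic data coefficients

mean_a, var_a = diagonal_posterior(l, c0, c1, d, n=100.0, tau=1.0)
mean_b, var_b = precision_form(l, c0, c1, d, n=100.0, tau=1.0)
```

Both routes should agree up to rounding, and with `tau = 1` the variances should not exceed the prior eigenvalues `c0`.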

In the next two theorems we show that the Gaussian posterior distribution $\mu^d$, with covariance and mean given by (3.3) and (3.4), is a proper conditional Gaussian distribution on $H$ and is absolutely continuous with respect to the prior.

Theorem 3.1. Suppose Assumption 2.1 holds. Then: (i) the covariance operator $\mathcal{C}$ of the conditional distribution $\mu^d$ given by (3.3) is trace class on $H$; (ii) the mean $m$ of the conditional posterior distribution given by (3.4) is an element of $H$, almost surely with respect to the joint distribution of $(u, \xi)$. Thus $\mu^d(H) = 1$.

Proof. The fact that $\mu^d(H) = 1$ follows from (i) and (ii) is well-known [3]. We thus prove these two points.

(i) Using the basis $\{\varphi_j\}$, by equation (3.3) we have that the eigenvalues of $\mathcal{C}$ are given by

$$c_j = \tau^2c_{0j} - \frac{\tau^2c_{0j}^2l_j^2}{c_{0j}l_j^2 + \lambda c_{1j}} = \frac{\tau^2\lambda c_{0j}c_{1j}}{c_{0j}l_j^2 + \lambda c_{1j}} \le \tau^2c_{0j}. \tag{3.5}$$

Since $\mathcal{C}_0$ is trace class on $H$, it follows that $\mathcal{C}$ is trace class on $H$.

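(ii) From (3.4) we have that

$$\mathbb{E}\|m\|^2 = \mathbb{E}\|Ad\|^2 = \mathbb{E}\Big\|A\mathcal{L}u + \frac{1}{\sqrt{n}}A\xi\Big\|^2 = \mathbb{E}\|A\mathcal{L}u\|^2 + \frac1n\mathbb{E}\|A\xi\|^2, \tag{3.6}$$

since $\xi$ and $u$ are independent and $\xi$ has mean zero. The distribution of $A\xi$ is $N(0, A\mathcal{C}_1A^*)$ and it follows, again working in the basis $\{\varphi_j\}_{j=1}^\infty$, that

$$\mathbb{E}\|m\|^2 = \mathbb{E}\|A\mathcal{L}u\|^2 + \frac1n\mathrm{Tr}(A\mathcal{C}_1A^*) = \sum_j \frac{\tau^2c_{0j}^3l_j^4}{(l_j^2c_{0j} + \lambda c_{1j})^2} + \frac1n\sum_j \frac{c_{0j}^2l_j^2c_{1j}}{(l_j^2c_{0j} + \lambda c_{1j})^2} \le \tau^2\sum_j c_{0j} + \frac{1}{n\lambda^2}\sum_j \frac{c_{0j}^2l_j^2}{c_{1j}} < \infty.$$

Hence $\|m\|$ is almost surely finite, which completes the proof. □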

Theorem 3.2. Suppose Assumption 2.1 holds. Then the posterior measure $\mu^d = N(m, \mathcal{C})$, with covariance and mean given by (3.3) and (3.4), respectively, is equivalent to the prior measure $\mu_0 = N(0, \tau^2\mathcal{C}_0)$.

Proof. By the Feldman-Hajek theorem [4, Theorem 2.23], to show that the Gaussian measure $\mu^d = N(m, \mathcal{C})$ is equivalent to $\mu_0 = N(0, \tau^2\mathcal{C}_0)$, it suffices to show:

(i) The Cameron-Martin spaces associated with $\mu^d$ and $\mu_0$ are equal, that is, $\mathcal{R}(\mathcal{C}^{\frac12}) = \mathcal{R}(\mathcal{C}_0^{\frac12}) := E$.

(ii) The posterior mean $m$ lies in the Cameron-Martin space $E$.

(iii) The operator $T = I - \tau^2\mathcal{C}^{-\frac12}\mathcal{C}_0\mathcal{C}^{-\frac12}$ is Hilbert-Schmidt.

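We now check the validity of the above conditions. For (i) it is equivalent to show that there exists a constant $M$ such that

$$\langle h, \mathcal{C}h\rangle \le M\langle h, \mathcal{C}_0h\rangle, \quad \forall h \in H \tag{3.7}$$

and

$$\langle h, \mathcal{C}_0h\rangle \le M\langle h, \mathcal{C}h\rangle, \quad \forall h \in H; \tag{3.8}$$

this follows from [20, Lemma 6.15] using [4, Proposition B.1]. Using the eigenbasis expansion, these are equivalent to

$$c_j \le Mc_{0j} \tag{3.9}$$

and

$$c_{0j} \le Mc_j. \tag{3.10}$$

From (3.5), we know that (3.9) is true with $M = \tau^2$. Again by (3.5), we have

$$c_j = \frac{\tau^2c_{0j}}{1 + \lambda^{-1}l_j^2c_{0j}c_{1j}^{-1}} \asymp \frac{\tau^2c_{0j}}{1 + \lambda^{-1}\exp(-2sj^b)j^{2\beta-2\alpha}} \ge Mc_{0j}, \tag{3.11}$$

where $M > 0$ is a constant independent of $j$, since $\exp(-2sj^b)j^{2\beta-2\alpha}$ is bounded uniformly in $j$.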

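For (ii), it is easy to check that $E = \mathcal{R}(\mathcal{C}_0^{\frac12}) = H^\alpha$. The mean square expectation of the posterior mean $m$ in $H^\alpha$ can be estimated similarly to (3.6):

$$\mathbb{E}\|m\|_\alpha^2 = \mathbb{E}\|\mathcal{C}_0^{-\frac12}m\|^2 = \mathbb{E}\|\mathcal{C}_0^{-\frac12}Ad\|^2 = \sum_j \frac{\tau^2c_{0j}^2l_j^4}{(l_j^2c_{0j} + \lambda c_{1j})^2} + \frac1n\sum_j \frac{c_{0j}l_j^2c_{1j}}{(l_j^2c_{0j} + \lambda c_{1j})^2} \le \frac{\tau^2}{\lambda^2}\sum_j \frac{c_{0j}^2l_j^4}{c_{1j}^2} + \frac{1}{n\lambda^2}\sum_j \frac{c_{0j}l_j^2}{c_{1j}} < \infty, \tag{3.12}$$

therefore $m \in E$ almost surely.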

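For (iii), using (3.5) again, we have

$$\sum_{j=1}^\infty \Big(1 - \frac{\tau^2c_{0j}}{c_j}\Big)^2 = \frac{1}{\lambda^2}\sum_{j=1}^\infty c_{0j}^2l_j^4c_{1j}^{-2} \asymp \frac{1}{\lambda^2}\sum_{j=1}^\infty \exp(-4sj^b)j^{4\beta-4\alpha} < \infty, \tag{3.13}$$

demonstrating that the operator $T$ is Hilbert-Schmidt. □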

The preceding result is interesting because, without the assumption that the inverse problem is severely ill-posed, it is possible to construct linear inverse problems of the form considered in this paper, but for which the posterior is not absolutely continuous with respect to the prior. For example, suppose that we modify Assumption 2.1 so that the forward operator $\mathcal{L}$ has singular values that decay algebraically, $l_j \asymp j^{-\ell}$, but retain the same assumptions on the prior and noise covariances. Then the posterior is again Gaussian with covariance and mean given by the formulae (3.1) and (3.2). The following proposition shows that, if the noise is too smooth, then the posterior is not absolutely continuous with respect to the prior:

Proposition 3.3. If $\beta > \alpha + \ell - \frac14$ then the posterior $\mu^d = N(m, \mathcal{C})$ is not absolutely continuous with respect to the prior $N(0, \tau^2\mathcal{C}_0)$, independently of the data $d$.

Proof. It suffices to show that the third condition of the Feldman-Hajek theorem fails [4, Theorem 2.23]. Indeed, $\mathcal{C}$ is diagonalizable in the basis $\{\varphi_j\}_{j\in\mathbb{N}}$ with eigenvalues $c_j$ such that

$$c_j \asymp \frac{j^{-2\alpha-2\beta}}{j^{-2\beta} + j^{-2\ell-2\alpha}}.$$

Thus, the operator $T := I - \tau^2\mathcal{C}^{-\frac12}\mathcal{C}_0\mathcal{C}^{-\frac12}$ is also diagonalizable in $\{\varphi_j\}_{j\in\mathbb{N}}$ with eigenvalues $t_j$, where

$$|t_j| = \Big|1 - \frac{\tau^2c_{0j}}{c_j}\Big| \asymp j^{-2\alpha-2\ell+2\beta}.$$

Hence, the operator $T$ is Hilbert-Schmidt if and only if the sequence $\{t_j\}$ is square summable, that is, if and only if $\beta < \alpha + \ell - \frac14$. □

4 Posterior Contraction 

In this section, we study the limiting behavior of the posterior distribution $\mu^{d^\dagger}_{\lambda,n}$ as the noise disappears, $n \to \infty$. Intuitively, we expect the mass of the posterior to concentrate in a small ball centered on the fixed true solution.
As in [H [HI [lOl [18], we study this problem by identifying e„ such that, for 
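arbitrary positive numbers $M_n \to \infty$, there holds

$$\mathbb{E}^{d^\dagger}\mu^{d^\dagger}_{\lambda,n}\{u : \|u - u^\dagger\| \ge M_n\epsilon_n\} \to 0. \tag{4.1}$$

Here expectation is with respect to the random variable $d^\dagger$, with probability distribution given by the data likelihood $N(\mathcal{L}u^\dagger, \frac1n\mathcal{C}_1)$, and $\epsilon_n$ is called the contraction rate of the posterior distribution with respect to the $H$-norm. By the Chebyshev inequality, we have

$$\mathbb{E}^{d^\dagger}\mu^{d^\dagger}_{\lambda,n}\{u : \|u - u^\dagger\| \ge M_n\epsilon_n\} \le \frac{1}{M_n^2\epsilon_n^2}\mathbb{E}^{d^\dagger}\Big(\int \|u - u^\dagger\|^2\,\mu^{d^\dagger}_{\lambda,n}(du)\Big), \tag{4.2}$$

thus if

$$\mathbb{E}^{d^\dagger}\Big(\int \|u - u^\dagger\|^2\,\mu^{d^\dagger}_{\lambda,n}(du)\Big) \le M_0\epsilon_n^2, \tag{4.3}$$

where $M_0$ is a constant, we get $\mathbb{E}^{d^\dagger}\mu^{d^\dagger}_{\lambda,n}\{u : \|u - u^\dagger\| \ge M_n\epsilon_n\} \to 0$ as $M_n \to \infty$. The left hand side of (4.3) is the squared posterior contraction (SPC) which satisfies

$$SPC = \mathbb{E}^{d^\dagger}\|m^\dagger - u^\dagger\|^2 + \mathrm{Tr}(\mathcal{C}), \tag{4.4}$$

and therefore, it is enough to estimate the mean integrated squared error (MISE) $\mathbb{E}^{d^\dagger}\|m^\dagger - u^\dagger\|^2$ and the trace of the posterior covariance operator $\mathcal{C}$. By (3.4) we have

$$m^\dagger = Ad^\dagger = A\mathcal{L}u^\dagger + \frac{1}{\sqrt{n}}A\xi.$$

Meanwhile,

$$u^\dagger = A\mathcal{L}u^\dagger + (I - A\mathcal{L})u^\dagger,$$

so that we get the error equation

$$e := m^\dagger - u^\dagger = \frac{1}{\sqrt{n}}A\xi + (A\mathcal{L} - I)u^\dagger.$$

The first part of the error comes from the noise, while the second part comes from the regularization. Note that for $\lambda = 0$ formally we have $A\mathcal{L} = I$, so that the regularization part of the error vanishes, indicating that we can make the error small by ensuring that $\lambda \ll 1$ and $n \gg 1$. Since $\lambda = \frac{1}{n\tau^2}$ this indicates the possibility of an optimal choice of $\tau := \tau(n)$ to ensure that $\lambda = \frac{1}{n\tau(n)^2} \to 0$ as $n \to \infty$ and to balance the two sources of error. In the next three theorems, respectively, we estimate the MISE, the trace of the covariance and the SPC.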

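Theorem 4.1 (MISE). Under Assumption 2.1 the MISE may be estimated as follows:

$$\mathbb{E}^{d^\dagger}\|m^\dagger - u^\dagger\|^2 \ \begin{cases} \asymp \dfrac{1}{n\lambda}\big(\ln\lambda^{-\frac{1}{2s}}\big)^{-\frac{2\alpha}{b}} + \big(\ln\lambda^{-\frac{1}{2s}}\big)^{-\frac{2\gamma}{b}}, & b \ge 1, \\[2mm] \lesssim \dfrac{1}{n\lambda}\big(\ln\lambda^{-\frac{1}{2s}}\big)^{\frac{1-2\alpha-b}{b}} + \big(\ln\lambda^{-\frac{1}{2s}}\big)^{-\frac{2\gamma}{b}}, & b < 1. \end{cases} \tag{4.5}$$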

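Proof. From the expression above for the error $e$, since $\xi$ is centred Gaussian, we have

$$\mathbb{E}^{d^\dagger}\|m^\dagger - u^\dagger\|^2 = \frac1n\mathbb{E}\|A\xi\|^2 + \|(A\mathcal{L} - I)u^\dagger\|^2, \tag{4.6}$$

from which it follows that

$$\mathbb{E}^{d^\dagger}\|m^\dagger - u^\dagger\|^2 = \frac1n\mathrm{Tr}(A\mathcal{C}_1A^*) + \|(A\mathcal{L} - I)u^\dagger\|^2 := I + II. \tag{4.7}$$

By Assumption 2.1 it follows that

$$I \asymp \frac{1}{n\lambda^2}\sum_j \frac{\exp(-2sj^b)j^{2\beta-4\alpha}}{\big(1 + \frac1\lambda\exp(-2sj^b)j^{2\beta-2\alpha}\big)^2}$$

and

$$II \asymp \sum_j \frac{(u_j^\dagger)^2}{\big(1 + \frac1\lambda\exp(-2sj^b)j^{2\beta-2\alpha}\big)^2}.$$

To estimate $I$ and $II$ we split the sum according to the dominating term in the denominator. Define

$$F(j; \lambda) := \frac1\lambda\exp(-2sj^b)j^{2\beta-2\alpha}$$

and note that $F(1; \lambda) > 1$ for $\lambda$ sufficiently small. Since we are considering a limit in which $\lambda \to 0$ we assume that $F(1; \lambda) > 1$ henceforth. Let $J_\lambda$ be the unique solution of the equation $F(j; \lambda) = 1$ which exceeds 1. By Lemma 4.5 we have

$$J_\lambda \sim \big(\ln\lambda^{-\frac{1}{2s}}\big)^{\frac1b}. \tag{4.8}$$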

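For $I$, if $1 \le j \le J_\lambda$,

$$\frac1\lambda\exp(-2sj^b)j^{2\beta-2\alpha} \le 1 + \frac1\lambda\exp(-2sj^b)j^{2\beta-2\alpha} \le \frac2\lambda\exp(-2sj^b)j^{2\beta-2\alpha},$$

therefore

$$\frac{1}{n\lambda^2}\sum_{j \le J_\lambda} \frac{\exp(-2sj^b)j^{2\beta-4\alpha}}{\big(1 + \frac1\lambda\exp(-2sj^b)j^{2\beta-2\alpha}\big)^2} \asymp \frac1n\sum_{j \le J_\lambda} \exp(2sj^b)j^{-2\beta}. \tag{4.9}$$

The sum on the right hand side is bounded from above by the integral in the same range, and values at both endpoints. By Lemma 4.6 we have

$$\begin{aligned} \frac1n\sum_{j \le J_\lambda} \exp(2sj^b)j^{-2\beta} &\le \frac1n\exp(2sJ_\lambda^b)J_\lambda^{-2\beta} + \frac1n\exp(2s) + \frac1n\int_1^{J_\lambda} \exp(2sx^b)x^{-2\beta}\,dx \\ &= \frac1n\exp(2sJ_\lambda^b)J_\lambda^{-2\beta} + \frac1n\exp(2s) + \frac{1}{2sbn}\exp(2sJ_\lambda^b)J_\lambda^{-2\beta-b+1}(1 + o(1)) \\ &\lesssim \begin{cases} \frac1n\exp(2sJ_\lambda^b)J_\lambda^{-2\beta}(1 + o(1)), & b \ge 1, \\ \frac1n\exp(2sJ_\lambda^b)J_\lambda^{-2\beta-b+1}(1 + o(1)), & b < 1. \end{cases} \end{aligned}$$

Since

$$\frac1n\sum_{j \le J_\lambda} \exp(2sj^b)j^{-2\beta} \ge \frac1n\exp(2sJ_\lambda^b)J_\lambda^{-2\beta},$$

we deduce that for $b \ge 1$,

$$\frac1n\sum_{j \le J_\lambda} \exp(2sj^b)j^{-2\beta} \asymp \frac1n\exp(2sJ_\lambda^b)J_\lambda^{-2\beta} = \frac{1}{n\lambda}J_\lambda^{-2\alpha}. \tag{4.10}$$

For $0 < b < 1$ we have

$$\frac1n\sum_{j \le J_\lambda} \exp(2sj^b)j^{-2\beta} \lesssim \frac1n\exp(2sJ_\lambda^b)J_\lambda^{-2\beta-b+1} = \frac{1}{n\lambda}J_\lambda^{1-2\alpha-b}.$$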

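If $j > J_\lambda$, then $1 < 1 + \frac1\lambda\exp(-2sj^b)j^{2\beta-2\alpha} < 2$, thus we have

$$\frac{1}{n\lambda^2}\sum_{j > J_\lambda} \frac{\exp(-2sj^b)j^{2\beta-4\alpha}}{\big(1 + \frac1\lambda\exp(-2sj^b)j^{2\beta-2\alpha}\big)^2} \asymp \frac{1}{n\lambda^2}\sum_{j > J_\lambda} \exp(-2sj^b)j^{2\beta-4\alpha}.$$

Under our assumption on $\lambda$ being sufficiently small, we have that $J_\lambda$ is large enough so that $\exp(-2sj^b)j^{2\beta-4\alpha}$ is always decreasing with respect to $j$ and hence the sum on the right hand side is bounded from above by the integral in the same range, and the value at the left endpoint. By Lemma 4.7 we have

$$\begin{aligned} \frac{1}{n\lambda^2}\sum_{j > J_\lambda} \exp(-2sj^b)j^{2\beta-4\alpha} &\le \frac{1}{n\lambda^2}\exp(-2sJ_\lambda^b)J_\lambda^{2\beta-4\alpha} + \frac{1}{n\lambda^2}\int_{J_\lambda}^\infty \exp(-2sx^b)x^{2\beta-4\alpha}\,dx \\ &\le \frac{1}{n\lambda^2}\exp(-2sJ_\lambda^b)J_\lambda^{2\beta-4\alpha} + \frac{1}{2sbn\lambda^2}\exp(-2sJ_\lambda^b)J_\lambda^{2\beta-4\alpha-b+1}(1 + o(1)) \\ &\lesssim \begin{cases} \frac{1}{n\lambda^2}\exp(-2sJ_\lambda^b)J_\lambda^{2\beta-4\alpha}(1 + o(1)), & b \ge 1, \\ \frac{1}{n\lambda^2}\exp(-2sJ_\lambda^b)J_\lambda^{2\beta-4\alpha-b+1}(1 + o(1)), & b < 1. \end{cases} \end{aligned} \tag{4.11}$$

Since $\frac{1}{n\lambda^2}\sum_{j > J_\lambda} \exp(-2sj^b)j^{2\beta-4\alpha} \ge \frac{1}{n\lambda^2}\exp(-2sJ_\lambda^b)J_\lambda^{2\beta-4\alpha}$ for $b \ge 1$, we have

$$\frac{1}{n\lambda^2}\sum_{j > J_\lambda} \exp(-2sj^b)j^{2\beta-4\alpha} \asymp \frac{1}{n\lambda^2}\exp(-2sJ_\lambda^b)J_\lambda^{2\beta-4\alpha} = \frac{1}{n\lambda}J_\lambda^{-2\alpha},$$

and for $0 < b < 1$,

$$\frac{1}{n\lambda^2}\sum_{j > J_\lambda} \exp(-2sj^b)j^{2\beta-4\alpha} \lesssim \frac{1}{n\lambda^2}\exp(-2sJ_\lambda^b)J_\lambda^{2\beta-4\alpha-b+1} = \frac{1}{n\lambda}J_\lambda^{1-2\alpha-b}.$$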

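To estimate $II$, we employ an analysis similar to that applied to $I$. We have

$$\sum_{j \le J_\lambda} \frac{(u_j^\dagger)^2}{\big(1 + \frac1\lambda\exp(-2sj^b)j^{2\beta-2\alpha}\big)^2} \asymp \sum_{j \le J_\lambda} j^{2\gamma}(u_j^\dagger)^2\lambda^2\exp(4sj^b)j^{4\alpha-4\beta-2\gamma}. \tag{4.12}$$

For $\lambda$ small enough, the terms $\exp(4sj^b)j^{4\alpha-4\beta-2\gamma}$ for $1 \le j \le J_\lambda$ are dominated by $\exp(4sJ_\lambda^b)J_\lambda^{4\alpha-4\beta-2\gamma}$, so we have the following upper bound for the sum (4.12):

$$\sum_{j \le J_\lambda} j^{2\gamma}(u_j^\dagger)^2\lambda^2\exp(4sj^b)j^{4\alpha-4\beta-2\gamma} \le \lambda^2\exp(4sJ_\lambda^b)J_\lambda^{4\alpha-4\beta-2\gamma}\|u^\dagger\|_\gamma^2.$$

Furthermore

$$\sum_{j \le J_\lambda} j^{2\gamma}(u_j^\dagger)^2\lambda^2\exp(4sj^b)j^{4\alpha-4\beta-2\gamma} \ge (u_{J_\lambda}^\dagger)^2\lambda^2\exp(4sJ_\lambda^b)J_\lambda^{4\alpha-4\beta},$$

implying that, since $\gamma > 0$ and $u^\dagger \in H^\gamma$,

$$\sum_{j \le J_\lambda} j^{2\gamma}(u_j^\dagger)^2\lambda^2\exp(4sj^b)j^{4\alpha-4\beta-2\gamma} \asymp \lambda^2\exp(4sJ_\lambda^b)J_\lambda^{4\alpha-4\beta-2\gamma} = J_\lambda^{-2\gamma}. \tag{4.13}$$

The other part of the sum $II$ satisfies

$$\sum_{j > J_\lambda} \frac{(u_j^\dagger)^2}{\big(1 + \frac1\lambda\exp(-2sj^b)j^{2\beta-2\alpha}\big)^2} \asymp \sum_{j > J_\lambda} (u_j^\dagger)^2.$$

It follows that

$$\sum_{j > J_\lambda} j^{2\gamma}(u_j^\dagger)^2j^{-2\gamma} \lesssim J_\lambda^{-2\gamma}, \tag{4.14}$$

since $u^\dagger \in H^\gamma$.

Combining (4.6) - (4.14) completes the proof. □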

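Theorem 4.2 (Trace of $\mathcal{C}$). Let Assumption 2.1 hold and consider the posterior covariance operator $\mathcal{C}$ given by (2.5), with $\lambda$ as in (2.7). Then the trace is estimated as

$$\mathrm{Tr}(\mathcal{C}) \asymp \frac{1}{n\lambda}\big(\ln\lambda^{-\frac{1}{2s}}\big)^{\frac{1-2\alpha}{b}}. \tag{4.15}$$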

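Proof. From (3.3), we have

$$\mathrm{Tr}(\mathcal{C}) = \sum_j \frac{\tau^2\lambda c_{0j}c_{1j}}{c_{0j}l_j^2 + \lambda c_{1j}} \asymp \frac{1}{n\lambda}\sum_j \frac{j^{-2\alpha}}{1 + \frac1\lambda\exp(-2sj^b)j^{2\beta-2\alpha}}. \tag{4.16}$$

As in the proof of Theorem 4.1 we split the sum according to the dominating term in the denominator. For the first part, using Lemma 4.6 we have

$$\frac{1}{n\lambda}\sum_{j \le J_\lambda} \frac{j^{-2\alpha}}{1 + \frac1\lambda\exp(-2sj^b)j^{2\beta-2\alpha}} \asymp \frac1n\sum_{j \le J_\lambda} \exp(2sj^b)j^{-2\beta}. \tag{4.17}$$

For $b \ge 1$,

$$\frac1n\sum_{j \le J_\lambda} \exp(2sj^b)j^{-2\beta} \asymp \frac1n\exp(2sJ_\lambda^b)J_\lambda^{-2\beta} = \frac{1}{n\lambda}J_\lambda^{-2\alpha},$$

and for $0 < b < 1$,

$$\frac1n\sum_{j \le J_\lambda} \exp(2sj^b)j^{-2\beta} \lesssim \frac1n\exp(2sJ_\lambda^b)J_\lambda^{-2\beta-b+1} = \frac{1}{n\lambda}J_\lambda^{1-2\alpha-b}.$$

The other part of the sum on the right hand side of (4.16) satisfies

$$\frac{1}{n\lambda}\sum_{j > J_\lambda} \frac{j^{-2\alpha}}{1 + \frac1\lambda\exp(-2sj^b)j^{2\beta-2\alpha}} \asymp \frac{1}{n\lambda}\sum_{j > J_\lambda} j^{-2\alpha}.$$

By [10, Lemma 6.2], the last sum can be estimated as

$$\sum_{j > J_\lambda} j^{-2\alpha} \asymp J_\lambda^{1-2\alpha},$$

hence

$$\frac{1}{n\lambda}\sum_{j > J_\lambda} \frac{j^{-2\alpha}}{1 + \frac1\lambda\exp(-2sj^b)j^{2\beta-2\alpha}} \asymp \frac{1}{n\lambda}J_\lambda^{-2\alpha+1}. \tag{4.18}$$

Combining (4.8), (4.16) - (4.18) completes the proof. □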

We combine the two preceding theorems to determine the overall con- 
traction rate. 

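Theorem 4.3 (Rate of Contraction). Suppose that Assumption 2.1 holds, $\lambda$ is given by (2.7) and $\tau(n) > 0$ satisfies $n\tau^2(n) \to \infty$. Then the posterior distribution $\mu^{d^\dagger}_{\lambda,n}$ contracts around the true solution at the rate

$$\epsilon_n = \big(\ln(n\tau^2)\big)^{-\frac{\gamma}{b}} + \tau\big(\ln(n\tau^2)\big)^{-\frac{\alpha-\frac12}{b}}. \tag{4.19}$$

In particular

$$\epsilon_n = \begin{cases} (\ln n)^{-\frac{\gamma\wedge(\alpha-\frac12)}{b}}, & \tau \equiv 1, \\[1mm] (\ln n)^{-\frac{\gamma}{b}}, & n^{-\frac12+\sigma} \lesssim \tau \lesssim (\ln n)^{\frac{\alpha-\gamma-\frac12}{b}}, \end{cases} \tag{4.20}$$

where $\sigma > 0$ is some constant.

Proof. The estimate (4.19) follows by combining (4.4), Theorem 4.1 and Theorem 4.2. To obtain (4.20), we optimize by choosing $\big(\ln(n\tau^2)\big)^{-\frac{\gamma}{b}} \gtrsim \tau\big(\ln(n\tau^2)\big)^{-\frac{\alpha-\frac12}{b}}$. □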
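The rate (4.19) is simple to evaluate, and doing so makes the saturation effect for fixed $\tau = 1$ visible: the sum of the two terms always lies between the larger term and twice the larger term. The following is a minimal sketch assuming the form of (4.19) stated above; the function names and the sample values of $n$, $\alpha$, $\gamma$ and $b$ are hypothetical.

```python
import math

def contraction_rate(n, tau, alpha, gamma, b):
    """Evaluate eps_n from (4.19) for a given noise level n and prior scale tau."""
    log_term = math.log(n * tau**2)
    if log_term <= 0:
        raise ValueError("the formula is used here only for n * tau**2 > 1")
    return log_term ** (-gamma / b) + tau * log_term ** (-(alpha - 0.5) / b)

def saturated_rate(n, alpha, gamma, b):
    """Rate (ln n)^(-min(gamma, alpha - 1/2) / b) for the fixed choice tau = 1."""
    return math.log(n) ** (-min(gamma, alpha - 0.5) / b)

# Illustrative values: a smooth truth (gamma = 3) with alpha = 2, b = 1, so that
# the fixed-tau rate is limited by alpha - 1/2 rather than by gamma.
rates = [contraction_rate(n, 1.0, 2.0, 3.0, 1.0) for n in (1e3, 1e6, 1e12)]
bounds = [saturated_rate(n, 2.0, 3.0, 1.0) for n in (1e3, 1e6, 1e12)]
```

For these values each entry of `rates` is sandwiched between the corresponding entry of `bounds` and twice that entry, which is the sense in which the two expressions describe the same rate.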

Remark 4.4.

i) The rate of the MISE is determined by the regularity of the prior $\alpha$, the regularity of the truth $\gamma$ and the degree of ill-posedness as determined by the power $b$ in the eigenvalues of $\mathcal{L}$ ($s$ does not affect the rate). On the other hand, the rate of the trace of the posterior covariance is determined by $\alpha$ and $b$ and has nothing to do with the regularity of the truth $\gamma$. Finally the rate of contraction is determined by $\alpha$, $\gamma$ and $b$. Observe that the regularity of the noise, $\beta$, does not affect the rate. In the case of mildly ill-posed problems where the singular values of $\mathcal{L}$ decay algebraically, $\beta$ does appear in the error estimates, but only through the difference in regularity between the forward operator and the noise covariance. For our severely ill-posed problem this difference may be thought of as being infinite, explaining why $\beta$ disappears from the error estimates here.

ii) For fixed $\tau = 1$, the rate of contraction is $(\ln n)^{-\frac{\gamma\wedge(\alpha-\frac12)}{b}}$, that is, as $\gamma$ grows the rate improves until $\gamma = \alpha - \frac12$, at which point the rate saturates at $(\ln n)^{-\frac{\alpha-\frac12}{b}}$. On the contrary, for $n^{-\frac12+\sigma} \lesssim \tau \lesssim (\ln n)^{\frac{\alpha-\gamma-\frac12}{b}}$ the rate is $(\ln n)^{-\frac{\gamma}{b}}$ and never saturates.

iii) For the appropriate choice of $\tau = \tau(n)$ the contraction rate is $\epsilon_n = (\ln n)^{-\frac{\gamma}{b}}$, which for $\beta = 1$ or $2$ is optimal in the minimax sense with $L^2$-loss
11123/. 

We conclude the section with several technical lemmas used in the proofs of the preceding theorems.

Lemma 4.5. Let $a, b > 0$ and $t \in \mathbb{R}$ be constants. For all $\lambda$ sufficiently small the equation

$$\frac1\lambda\exp(-ax^b)x^t = 1 \tag{4.21}$$

has a unique solution $J_\lambda$ in $\{x \ge 1\}$ and $J_\lambda \sim \big(\ln\lambda^{-\frac1a}\big)^{\frac1b}$ as $\lambda \to 0$.

Proof. Uniqueness of a root in $\{x \ge 1\}$ follows automatically provided

$$\lambda^{-1}\exp(-a) > 1,$$

since $x \mapsto \exp(-ax^b)x^t$ has at most one maximum in $\{x > 0\}$. From (4.21), it is easy to see that

$$J_\lambda^b = \ln\lambda^{-\frac1a} + \frac{t}{a}\ln J_\lambda.$$

Since $x \ge 1$, $J_\lambda \to \infty$ as $\lambda \to 0$, thus we have $1 \sim \frac{\ln\lambda^{-1/a}}{J_\lambda^b}$, which completes the proof. □
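The asymptotic statement of Lemma 4.5 can be checked numerically by solving (4.21) in logarithmic form, which avoids underflow for very small $\lambda$. The sketch below uses plain bisection and is an illustration only: the function names and the parameter values $a = 2$, $b = 1$, $t = -2$ are hypothetical, and the monotonicity used by the bisection holds for this choice because $t \le 0$.

```python
import math

def solve_J(log_inv_lam, a, b, t, tol=1e-12):
    """Root in x >= 1 of (1/lam) * exp(-a * x**b) * x**t = 1, written as
    g(x) = ln(1/lam) - a * x**b + t * ln(x) = 0 and solved by bisection.
    Assumes g(1) > 0 and that g is decreasing on x >= 1 (true for t <= 0)."""
    def g(x):
        return log_inv_lam - a * x**b + t * math.log(x)
    lo, hi = 1.0, 2.0
    while g(hi) > 0:              # expand the bracket until the sign changes
        hi *= 2.0
    while hi - lo > tol * hi:     # bisect down to a relative tolerance
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def asymptotic_J(log_inv_lam, a, b):
    """Leading-order prediction (ln(1/lam) / a)**(1/b) from Lemma 4.5."""
    return (log_inv_lam / a) ** (1.0 / b)

# Ratio of the numerical root to the prediction for ln(1/lam) = 1e2, 1e4, 1e6.
ratios = [solve_J(L, 2.0, 1.0, -2.0) / asymptotic_J(L, 2.0, 1.0)
          for L in (1e2, 1e4, 1e6)]
```

The ratios approach 1 only slowly, reflecting the logarithmic correction term $\frac{t}{a}\ln J_\lambda$ in the proof.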

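Lemma 4.6. For $a > 0$, $b > 0$ and $c \in \mathbb{R}$, we have as $J \to \infty$,

$$\int_1^J \exp(ax^b)x^c\,dx \sim \frac{1}{ab}\exp(aJ^b)J^{c-b+1}. \tag{4.22}$$

Proof. By variable substitution $x^b = y$ and integration by parts, we have

$$\int_1^J \exp(ax^b)x^c\,dx = \frac{1}{ab}\big(\exp(aJ^b)J^{c-b+1} - \exp(a)\big) - \frac{c-b+1}{ab^2}\int_1^{J^b} \exp(ay)y^{\frac{c-2b+1}{b}}\,dy. \tag{4.23}$$

Since

$$\frac{\exp(ay)y^{\frac{c-2b+1}{b}}}{\exp(aJ^b)J^{c-b+1}} \le \exp\big(a(y - J^b)\big)J^{-b},$$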

if we divide both sides of (4.23) by $\exp(aJ^b)J^{c-b+1}$, then the integral on the right hand side tends to zero as $J$ tends to infinity by the monotone convergence theorem. We have as $J \to \infty$

$$\int_1^J \exp(ax^b)x^c\,dx \sim \frac{1}{ab}\exp(aJ^b)J^{c-b+1},$$

which completes the proof. □

Lemma 4.7. For $J > 0$, $a > 0$, $b > 0$ and $c \in \mathbb{R}$ we have

$$\int_J^\infty \exp(-ax^b)x^c\,dx \lesssim \exp(-aJ^b)J^{c-b+1}. \tag{4.24}$$

Proof. Similar to the proof of Lemma 4.6 we have

$$\int_J^\infty \exp(-ax^b)x^c\,dx = \frac{1}{ab}\exp(-aJ^b)J^{c-b+1} + \frac{c-b+1}{ab^2}\int_{J^b}^\infty \exp(-ay)y^{\frac{c-2b+1}{b}}\,dy.$$

If $\frac{c-b+1}{b} > 0$, then we integrate by parts $n$ times, until the constant in front of the remaining integral becomes non-positive for the first time. Once that constant is non-positive we can ignore the integral on the right hand side to get

$$\int_J^\infty \exp(-ax^b)x^c\,dx \le \frac{1}{ab}\exp(-aJ^b)\big(J^{c-b+1}(1 + o(1))\big).$$

□



5 Example 

In this section, we present the Cauchy problem for the Helmholtz equation as an example to which the theoretical analysis of this paper can be applied. For simplicity, we only consider the small wave number case ($0 < k < 1$) for
illustration. For more details regarding the more general case, we refer to 

m- 

Consider the following boundary value problem for the Helmholtz equation:

$$\begin{cases} \Delta v(x, y) + k^2v(x, y) = 0, & (x, y) \in (0, \pi) \times (0, 1), \\ v_y(x, 0) = 0, & x \in [0, \pi], \\ v(x, 1) = u(x), & x \in [0, \pi], \\ v(0, y) = v(\pi, y) = 0, & y \in [0, 1]. \end{cases} \tag{5.1}$$

Problem (5.1) is well-posed since it corresponds to inversion of a negative-definite elliptic operator with mixed Dirichlet/Neumann data. In fact, by the method of separation of variables, the solution $v(x, y)$ in the domain $(0, \pi) \times (0, 1)$ can be expressed as

$$v(x, y) = \sum_{j=1}^\infty \frac{\cosh(\sqrt{j^2 - k^2}\,y)}{\cosh(\sqrt{j^2 - k^2})}u_j\varphi_j(x),$$

where $\varphi_j(x) = \sqrt{\frac{2}{\pi}}\sin(jx)$ and $u_j = \langle u, \varphi_j\rangle$.

Define the forward mapping $\mathcal{L} : \mathcal{D}(\mathcal{L}) \subset L^2(0, \pi) \to L^2(0, \pi)$ by

$$\mathcal{L}u(x) = v(x, 0) = \sum_{j=1}^\infty \frac{1}{\cosh(\sqrt{j^2 - k^2})}u_j\varphi_j(x),$$

which maps the boundary data of (5.1) on $y = 1$ into the solution on $y = 0$. Then $\mathcal{L}$ is a self-adjoint, positive-definite, linear operator, with eigenvalues behaving as

$$\frac{1}{\cosh(\sqrt{j^2 - k^2})} \asymp \exp(-j).$$

The inverse problem is to find the function $u$, given noisy observations of $v(\cdot, 0)$. More precisely the data $d$ is given by

$$d = \mathcal{L}u + \frac{1}{\sqrt{n}}\xi.$$

If we place a Gaussian measure $N(0, \tau^2\mathcal{C}_0)$ as prior on $u$ and assume that $\xi$ is also Gaussian $N(0, \mathcal{C}_1)$, then we may apply the theory developed in this paper. Under Assumption 2.1, Theorem 4.3 can be applied to this problem with $b = 1$ and $s = 1$ to obtain the contraction rate of the conditional Gaussian posterior distribution.

We now present a numerical simulation for obtaining the rate of the MISE as the noise disappears ($n \to \infty$), when $\alpha = 2$, $\gamma = 1$ and we have a fixed $\tau = 1$. In this case, our theory predicts that

$$\mathrm{MISE} \asymp (\ln\sqrt{n})^{-2(\gamma\wedge\alpha)} = (\ln\sqrt{n})^{-2}.$$

To simulate the MISE we average the error over a thousand realizations of the noise for $n = 10^k$, $k = 1, \ldots, 100$. We denote the simulated MISE by $\widehat{\mathrm{MISE}}$.
The true solution u"^ G H'^' is a fixed draw from a Gaussian measure N{0, E), 
where S has eigenvalues aj = , for e = 10"^*^. We use the first 10^ 

Fourier modes. In Figure 1 we plot $-\frac12\ln(\widehat{\mathrm{MISE}})$ against $\ln(\ln\sqrt{n})$ in the case $\beta = 0$. The solid line is the relation predicted by Theorem 4.1, that is, a line with slope 1. A least squares fit to the simulated points gives a slope of 1.0341 with coefficient of determination 0.9884. In Figure 2 we have $\beta = 2$ and all the other parameters the same. The least squares fit gives a slope of 0.9723 with coefficient of determination 0.9916, confirming that the regularity of the noise as determined by $\beta$ does not affect the rate of convergence.

Figure 1: $-\frac12\ln(\widehat{\mathrm{MISE}})$ plotted against $\ln(\ln\sqrt{n})$ for $n = 10^k$, $k = 1, \ldots, 100$ in the case $b = s = 1$, $\alpha = 2$, $\beta = 0$, $\gamma = 1$, for fixed $\tau = 1$.

Figure 2: $-\frac12\ln(\widehat{\mathrm{MISE}})$ plotted against $\ln(\ln\sqrt{n})$ for $n = 10^k$, $k = 1, \ldots, 100$ in the case $b = s = 1$, $\alpha = 2$, $\beta = 2$, $\gamma = 1$, for fixed $\tau = 1$.
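A rate check of the kind described in this section can also be sketched without sampling the noise, because in the diagonal setting the decomposition (4.7) gives the MISE mode by mode as $I + II$. The code below is a hedged illustration, not the experiment reported above: the wave number `k = 0.5`, the truncation to 500 modes, the deterministic truth coefficients $u_j^\dagger = j^{-\gamma-\frac12-\epsilon}$ with $\epsilon = 0.01$, and the function name are all assumptions made for this sketch.

```python
import numpy as np

def exact_mise(n, alpha=2.0, beta=0.0, gamma=1.0, tau=1.0, k=0.5, J=500, eps=0.01):
    """Truncated MISE = I + II from (4.7) for the Helmholtz example."""
    j = np.arange(1, J + 1, dtype=float)
    l = 1.0 / np.cosh(np.sqrt(j**2 - k**2))   # eigenvalues of the forward map
    c0 = j ** (-2.0 * alpha)                  # prior eigenvalues
    c1 = j ** (-2.0 * beta)                   # noise eigenvalues
    u_true = j ** (-gamma - 0.5 - eps)        # hypothetical truth in H^gamma
    lam = 1.0 / (n * tau**2)                  # lambda as in (2.7)
    a = c0 * l / (l**2 * c0 + lam * c1)       # eigenvalues of A
    noise_part = np.sum(a**2 * c1) / n        # I  = Tr(A C_1 A*) / n
    bias_part = np.sum(((a * l - 1.0) * u_true) ** 2)  # II = ||(A L - I) u||^2
    return float(noise_part + bias_part)

# Least squares slope of -0.5 * ln(MISE) against ln(ln(sqrt(n))), n = 10^k.
ks = np.arange(1, 101)
ns = 10.0 ** ks
x = np.log(np.log(np.sqrt(ns)))
y = np.array([-0.5 * np.log(exact_mise(n)) for n in ns])
slope, intercept = np.polyfit(x, y, 1)
```

Calling `exact_mise` with `beta=2.0` instead of the default gives the analogue of the second experiment; the fitted `slope` can then be compared with the value 1 predicted by Theorem 4.1.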

Acknowledgments. The authors are grateful to Bartek Knapik and